Overview

Dataset statistics

Number of variables: 17
Number of observations: 10840
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.4 MiB
Average record size in memory: 136.0 B

Variable types

Numeric: 9
Categorical: 8

Warnings

App has a high cardinality: 9659 distinct values [High cardinality]
Genres has a high cardinality: 119 distinct values [High cardinality]
Last Updated has a high cardinality: 1377 distinct values [High cardinality]
Current Ver has a high cardinality: 2831 distinct values [High cardinality]
Reviews is highly correlated with Installs [High correlation]
Installs is highly correlated with Reviews [High correlation]
Reviews_log is highly correlated with Installs_log [High correlation]
Installs_log is highly correlated with Reviews_log [High correlation]
Reviews is highly correlated with Installs and 2 other fields [High correlation]
Installs is highly correlated with Reviews and 2 other fields [High correlation]
Price is highly correlated with Earnings [High correlation]
Earnings is highly correlated with Price [High correlation]
Reviews_log is highly correlated with Reviews and 2 other fields [High correlation]
Installs_log is highly correlated with Reviews and 2 other fields [High correlation]
df_index is highly correlated with Price and 1 other fields [High correlation]
Rating is highly correlated with Price and 1 other fields [High correlation]
Reviews is highly correlated with Installs and 4 other fields [High correlation]
Size is highly correlated with Price and 1 other fields [High correlation]
Installs is highly correlated with Reviews and 4 other fields [High correlation]
Price is highly correlated with df_index and 5 other fields [High correlation]
Earnings is highly correlated with df_index and 6 other fields [High correlation]
Reviews_log is highly correlated with Reviews and 3 other fields [High correlation]
Installs_log is highly correlated with Reviews and 2 other fields [High correlation]
df_index is highly correlated with Category and 1 other fields [High correlation]
Reviews is highly correlated with Installs and 1 other fields [High correlation]
Installs is highly correlated with Reviews and 2 other fields [High correlation]
Content Rating is highly correlated with Category [High correlation]
Size is highly correlated with Android Ver [High correlation]
Android Ver is highly correlated with Size [High correlation]
Reviews_log is highly correlated with Reviews and 2 other fields [High correlation]
Category is highly correlated with df_index and 1 other fields [High correlation]
Installs_log is highly correlated with df_index and 2 other fields [High correlation]
Price is highly skewed (γ1 = 23.70739238) [Skewed]
Earnings is highly skewed (γ1 = 57.6434555) [Skewed]
df_index is uniformly distributed [Uniform]
App is uniformly distributed [Uniform]
df_index has unique values [Unique]
Reviews has 596 (5.5%) zeros [Zeros]
Price has 10040 (92.6%) zeros [Zeros]
Earnings has 10050 (92.7%) zeros [Zeros]
Reviews_log has 596 (5.5%) zeros [Zeros]
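
The percentages in the Zeros warnings are the zero count divided by the number of observations (10840). The following Python sketch recomputes them from the counts reported above. The helper name `zero_share` is illustrative and is not part of pandas-profiling.

```python
N_OBSERVATIONS = 10840  # "Number of observations" from the overview

# Zero counts as reported in the Zeros warnings.
zero_counts = {"Reviews": 596, "Price": 10040, "Earnings": 10050, "Reviews_log": 596}


def zero_share(zeros: int, total: int = N_OBSERVATIONS) -> float:
    """Return the share of zero values as a percentage, rounded to one decimal."""
    return round(100 * zeros / total, 1)


shares = {name: zero_share(count) for name, count in zero_counts.items()}
for name, share in shares.items():
    print(f"{name}: {share}% zeros")
```

The recomputed shares agree with the figures in the warnings, which is a quick sanity check that the counts and the observation total belong to the same snapshot of the data.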

Reproduction

Analysis started: 2021-08-29 19:45:13.087113
Analysis finished: 2021-08-29 19:45:32.444135
Duration: 19.36 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 10840
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5419.533948
Minimum: 0
Maximum: 10840
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 84.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 541.95
Q1: 2709.75
Median: 5419.5
Q3: 8129.25
95-th percentile: 10297.05
Maximum: 10840
Range: 10840
Interquartile range (IQR): 5419.5

Descriptive statistics

Standard deviation: 3129.439605
Coefficient of variation (CV): 0.5774370332
Kurtosis: -1.199927086
Mean: 5419.533948
Median Absolute Deviation (MAD): 2710
Skewness: 5.86343578 × 10⁻⁵
Sum: 58747748
Variance: 9793392.239
Monotonicity: Not monotonic
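
Several of these figures are derived from others: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, the IQR is Q3 minus Q1, and the range is maximum minus minimum. A small Python sketch that re-derives them from the df_index values reported above (the variable names are illustrative):

```python
import math

# Values copied from the df_index statistics in this report.
mean = 5419.533948
std = 3129.439605
q1, q3 = 2709.75, 8129.25
minimum, maximum = 0, 10840

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV={cv:.10f} variance={variance:.3f} IQR={iqr} range={value_range}")
```

Because the inputs are rounded report values, the derived numbers should match the reported ones only up to rounding.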
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
6319                  1      < 0.1%
1735                  1      < 0.1%
1665                  1      < 0.1%
3222                  1      < 0.1%
5673                  1      < 0.1%
4257                  1      < 0.1%
5655                  1      < 0.1%
1726                  1      < 0.1%
1730                  1      < 0.1%
1733                  1      < 0.1%
Other values (10830)  10830  99.9%

Value  Count  Frequency (%)
0      1      < 0.1%
1      1      < 0.1%
2      1      < 0.1%
3      1      < 0.1%
4      1      < 0.1%
5      1      < 0.1%
6      1      < 0.1%
7      1      < 0.1%
8      1      < 0.1%
9      1      < 0.1%

Value  Count  Frequency (%)
10840  1      < 0.1%
10839  1      < 0.1%
10838  1      < 0.1%
10837  1      < 0.1%
10836  1      < 0.1%
10835  1      < 0.1%
10834  1      < 0.1%
10833  1      < 0.1%
10832  1      < 0.1%
10831  1      < 0.1%

App
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 9659
Distinct (%): 89.1%
Missing: 0
Missing (%): 0.0%
Memory size: 84.8 KiB
ROBLOX                                             9
CBS Sports App - Scores, News, Stats & Watch Live  8
Duolingo: Learn Languages Free                     7
Candy Crush Saga                                   7
ESPN                                               7
Other values (9654)                                10802

Length

Max length: 194
Median length: 21
Mean length: 22.51743542
Min length: 1

Characters and Unicode

Total characters: 244089
Distinct characters: 478
Distinct categories: 18
Distinct scripts: 16
Distinct blocks: 29
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
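
As an illustration of these character properties, Python's standard-library `unicodedata` module exposes the general category of each code point (for example `Lu` for Uppercase Letter, `Ll` for Lowercase Letter, `Zs` for Space Separator). The sketch below counts categories in one app name; the helper `category_counts` is hypothetical, and the report itself may compute these properties with a different library.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category (e.g. 'Lu', 'Ll', 'Zs')."""
    return Counter(unicodedata.category(ch) for ch in text)


# One of the app names listed in this report.
counts = category_counts("Candy Crush Saga")
print(counts)
```

Summing such counters over every value of a column would give a category breakdown in the spirit of the tables in this section.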

Unique

Unique: 8861
Unique (%): 81.7%

Sample

1st row: BJ Bridge Standard American 2018
2nd row: Thistletown CI
3rd row: FE Mechanical Engineering Prep
4th row: Clarksburg AH
5th row: CJ DVD Rentals

Common Values

Value                                              Count  Frequency (%)
ROBLOX                                             9      0.1%
CBS Sports App - Scores, News, Stats & Watch Live  8      0.1%
Duolingo: Learn Languages Free                     7      0.1%
Candy Crush Saga                                   7      0.1%
ESPN                                               7      0.1%
8 Ball Pool                                        7      0.1%
slither.io                                         6      0.1%
Zombie Catchers                                    6      0.1%
Nick                                               6      0.1%
Sniper 3D Gun Shooter: Free Shooting Games - FPS   6      0.1%
Other values (9649)                                10771  99.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
2823
 
6.6%
for560
 
1.3%
free514
 
1.2%
app332
 
0.8%
and283
 
0.7%
the270
 
0.6%
mobile222
 
0.5%
news195
 
0.5%
video194
 
0.5%
live194
 
0.5%
Other values (9548)36991
86.9%

Most occurring characters

ValueCountFrequency (%)
31738
 
13.0%
e19418
 
8.0%
a14836
 
6.1%
o13636
 
5.6%
r13241
 
5.4%
i12140
 
5.0%
t10334
 
4.2%
n9846
 
4.0%
s8716
 
3.6%
l8546
 
3.5%
Other values (468)101638
41.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter154480
63.3%
Uppercase Letter47206
 
19.3%
Space Separator31739
 
13.0%
Other Punctuation4082
 
1.7%
Dash Punctuation2675
 
1.1%
Decimal Number2381
 
1.0%
Close Punctuation350
 
0.1%
Open Punctuation346
 
0.1%
Other Letter341
 
0.1%
Other Symbol232
 
0.1%
Other values (8)257
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
ا13
 
3.8%
ر9
 
2.6%
ل7
 
2.1%
6
 
1.8%
6
 
1.8%
5
 
1.5%
ب5
 
1.5%
ق5
 
1.5%
ح5
 
1.5%
4
 
1.2%
Other values (214)276
80.9%
Lowercase Letter
ValueCountFrequency (%)
e19418
12.6%
a14836
 
9.6%
o13636
 
8.8%
r13241
 
8.6%
i12140
 
7.9%
t10334
 
6.7%
n9846
 
6.4%
s8716
 
5.6%
l8546
 
5.5%
c4924
 
3.2%
Other values (71)38843
25.1%
Uppercase Letter
ValueCountFrequency (%)
C4227
 
9.0%
S3850
 
8.2%
A3236
 
6.9%
B3040
 
6.4%
P2875
 
6.1%
F2839
 
6.0%
D2805
 
5.9%
T2716
 
5.8%
M2705
 
5.7%
E2500
 
5.3%
Other values (39)16413
34.8%
Other Symbol
ValueCountFrequency (%)
99
42.7%
®56
24.1%
🔥6
 
2.6%
🍀6
 
2.6%
5
 
2.2%
5
 
2.2%
°4
 
1.7%
4
 
1.7%
🎨3
 
1.3%
3
 
1.3%
Other values (34)41
17.7%
Other Punctuation
ValueCountFrequency (%)
:999
24.5%
&979
24.0%
,901
22.1%
.587
14.4%
'225
 
5.5%
!150
 
3.7%
/128
 
3.1%
?39
 
1.0%
#17
 
0.4%
"13
 
0.3%
Other values (9)44
 
1.1%
Decimal Number
ValueCountFrequency (%)
0463
19.4%
2452
19.0%
1440
18.5%
3279
11.7%
8203
8.5%
4175
 
7.3%
5112
 
4.7%
7111
 
4.7%
679
 
3.3%
967
 
2.8%
Math Symbol
ValueCountFrequency (%)
+133
67.9%
|51
 
26.0%
~5
 
2.6%
>2
 
1.0%
1
 
0.5%
1
 
0.5%
1
 
0.5%
×1
 
0.5%
1
 
0.5%
Nonspacing Mark
ValueCountFrequency (%)
5
38.5%
2
 
15.4%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
Spacing Mark
ValueCountFrequency (%)
4
26.7%
2
13.3%
2
13.3%
ি2
13.3%
2
13.3%
1
 
6.7%
1
 
6.7%
1
 
6.7%
Dash Punctuation
ValueCountFrequency (%)
-2502
93.5%
146
 
5.5%
21
 
0.8%
2
 
0.1%
2
 
0.1%
1
 
< 0.1%
1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
(317
91.6%
[22
 
6.4%
3
 
0.9%
3
 
0.9%
1
 
0.3%
Close Punctuation
ValueCountFrequency (%)
)321
91.7%
]22
 
6.3%
3
 
0.9%
3
 
0.9%
1
 
0.3%
Final Punctuation
ValueCountFrequency (%)
19
90.5%
1
 
4.8%
»1
 
4.8%
Space Separator
ValueCountFrequency (%)
31738
> 99.9%
 1
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_2
100.0%
Modifier Symbol
ValueCountFrequency (%)
`1
100.0%
Modifier Letter
ValueCountFrequency (%)
8
100.0%
Initial Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin201621
82.6%
Common42034
 
17.2%
Arabic86
 
< 0.1%
Hangul76
 
< 0.1%
Han70
 
< 0.1%
Katakana59
 
< 0.1%
Cyrillic58
 
< 0.1%
Bengali18
 
< 0.1%
Khmer13
 
< 0.1%
Ethiopic13
 
< 0.1%
Other values (6)41
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
31738
75.5%
-2502
 
6.0%
:999
 
2.4%
&979
 
2.3%
,901
 
2.1%
.587
 
1.4%
0463
 
1.1%
2452
 
1.1%
1440
 
1.0%
)321
 
0.8%
Other values (98)2652
 
6.3%
Latin
ValueCountFrequency (%)
e19418
 
9.6%
a14836
 
7.4%
o13636
 
6.8%
r13241
 
6.6%
i12140
 
6.0%
t10334
 
5.1%
n9846
 
4.9%
s8716
 
4.3%
l8546
 
4.2%
c4924
 
2.4%
Other values (87)85984
42.6%
Han
ValueCountFrequency (%)
3
 
4.3%
2
 
2.9%
2
 
2.9%
2
 
2.9%
2
 
2.9%
2
 
2.9%
2
 
2.9%
1
 
1.4%
1
 
1.4%
1
 
1.4%
Other values (52)52
74.3%
Hangul
ValueCountFrequency (%)
6
 
7.9%
4
 
5.3%
3
 
3.9%
3
 
3.9%
2
 
2.6%
2
 
2.6%
2
 
2.6%
2
 
2.6%
2
 
2.6%
2
 
2.6%
Other values (47)48
63.2%
Katakana
ValueCountFrequency (%)
6
 
10.2%
5
 
8.5%
4
 
6.8%
3
 
5.1%
3
 
5.1%
3
 
5.1%
3
 
5.1%
3
 
5.1%
2
 
3.4%
2
 
3.4%
Other values (24)25
42.4%
Cyrillic
ValueCountFrequency (%)
и6
 
10.3%
е5
 
8.6%
о4
 
6.9%
в3
 
5.2%
л3
 
5.2%
с3
 
5.2%
Р3
 
5.2%
т3
 
5.2%
б2
 
3.4%
ъ2
 
3.4%
Other values (17)24
41.4%
Arabic
ValueCountFrequency (%)
ا13
15.1%
ر9
 
10.5%
ل7
 
8.1%
ب5
 
5.8%
ق5
 
5.8%
ح5
 
5.8%
ع4
 
4.7%
ن4
 
4.7%
ي4
 
4.7%
و4
 
4.7%
Other values (16)26
30.2%
Ethiopic
ValueCountFrequency (%)
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
Other values (3)3
23.1%
Bengali
ValueCountFrequency (%)
4
22.2%
2
11.1%
2
11.1%
2
11.1%
ি2
11.1%
1
 
5.6%
1
 
5.6%
1
 
5.6%
1
 
5.6%
1
 
5.6%
Khmer
ValueCountFrequency (%)
2
15.4%
2
15.4%
1
7.7%
1
7.7%
1
7.7%
1
7.7%
1
7.7%
1
7.7%
1
7.7%
1
7.7%
Devanagari
ValueCountFrequency (%)
2
18.2%
2
18.2%
2
18.2%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
Hiragana
ValueCountFrequency (%)
2
25.0%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Greek
ValueCountFrequency (%)
σ2
28.6%
β1
14.3%
Σ1
14.3%
Α1
14.3%
Ε1
14.3%
Κ1
14.3%
Hebrew
ValueCountFrequency (%)
ה1
20.0%
פ1
20.0%
ק1
20.0%
ו1
20.0%
ת1
20.0%
Myanmar
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%
Inherited
ValueCountFrequency (%)
5
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII242988
99.5%
Latin 1 Sup207
 
0.1%
Punctuation197
 
0.1%
Letterlike Symbols100
 
< 0.1%
Arabic86
 
< 0.1%
Hangul76
 
< 0.1%
CJK70
 
< 0.1%
None69
 
< 0.1%
Katakana67
 
< 0.1%
Cyrillic58
 
< 0.1%
Other values (19)171
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
31738
 
13.1%
e19418
 
8.0%
a14836
 
6.1%
o13636
 
5.6%
r13241
 
5.4%
i12140
 
5.0%
t10334
 
4.3%
n9846
 
4.1%
s8716
 
3.6%
l8546
 
3.5%
Other values (77)100537
41.4%
Letterlike Symbols
ValueCountFrequency (%)
99
99.0%
1
 
1.0%
Latin 1 Sup
ValueCountFrequency (%)
®56
27.1%
é33
15.9%
á19
 
9.2%
í19
 
9.2%
·10
 
4.8%
ü8
 
3.9%
ñ6
 
2.9%
ú6
 
2.9%
ö4
 
1.9%
ó4
 
1.9%
Other values (24)42
20.3%
Punctuation
ValueCountFrequency (%)
146
74.1%
21
 
10.7%
19
 
9.6%
5
 
2.5%
2
 
1.0%
1
 
0.5%
1
 
0.5%
1
 
0.5%
1
 
0.5%
Cyrillic
ValueCountFrequency (%)
и6
 
10.3%
е5
 
8.6%
о4
 
6.9%
в3
 
5.2%
л3
 
5.2%
с3
 
5.2%
Р3
 
5.2%
т3
 
5.2%
б2
 
3.4%
ъ2
 
3.4%
Other values (17)24
41.4%
Hangul
ValueCountFrequency (%)
6
 
7.9%
4
 
5.3%
3
 
3.9%
3
 
3.9%
2
 
2.6%
2
 
2.6%
2
 
2.6%
2
 
2.6%
2
 
2.6%
2
 
2.6%
Other values (47)48
63.2%
VS
ValueCountFrequency (%)
5
100.0%
None
ValueCountFrequency (%)
🔥6
 
8.7%
🍀6
 
8.7%
3
 
4.3%
3
 
4.3%
3
 
4.3%
3
 
4.3%
3
 
4.3%
🎨3
 
4.3%
💎2
 
2.9%
2
 
2.9%
Other values (31)35
50.7%
Katakana
ValueCountFrequency (%)
8
 
11.9%
6
 
9.0%
5
 
7.5%
4
 
6.0%
3
 
4.5%
3
 
4.5%
3
 
4.5%
3
 
4.5%
3
 
4.5%
2
 
3.0%
Other values (25)27
40.3%
CJK
ValueCountFrequency (%)
3
 
4.3%
2
 
2.9%
2
 
2.9%
2
 
2.9%
2
 
2.9%
2
 
2.9%
2
 
2.9%
1
 
1.4%
1
 
1.4%
1
 
1.4%
Other values (52)52
74.3%
Math Operators
ValueCountFrequency (%)
1
50.0%
1
50.0%
Latin Ext A
ValueCountFrequency (%)
ı16
30.2%
İ14
26.4%
Ž4
 
7.5%
č3
 
5.7%
Š2
 
3.8%
ğ2
 
3.8%
ř2
 
3.8%
ž2
 
3.8%
š2
 
3.8%
ů1
 
1.9%
Other values (5)5
 
9.4%
Emoticons
ValueCountFrequency (%)
😘1
20.0%
😜1
20.0%
😄1
20.0%
😂1
20.0%
😍1
20.0%
Misc Symbols
ValueCountFrequency (%)
5
45.5%
4
36.4%
2
 
18.2%
Latin Ext Additional
ValueCountFrequency (%)
2
66.7%
1
33.3%
Arrows
ValueCountFrequency (%)
1
100.0%
Dingbats
ValueCountFrequency (%)
5
50.0%
3
30.0%
2
 
20.0%
Hiragana
ValueCountFrequency (%)
2
25.0%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Geometric Shapes
ValueCountFrequency (%)
2
66.7%
1
33.3%
Bengali
ValueCountFrequency (%)
4
22.2%
2
11.1%
2
11.1%
2
11.1%
ি2
11.1%
1
 
5.6%
1
 
5.6%
1
 
5.6%
1
 
5.6%
1
 
5.6%
Arabic
ValueCountFrequency (%)
ا13
15.1%
ر9
 
10.5%
ل7
 
8.1%
ب5
 
5.8%
ق5
 
5.8%
ح5
 
5.8%
ع4
 
4.7%
ن4
 
4.7%
ي4
 
4.7%
و4
 
4.7%
Other values (16)26
30.2%
Devanagari
ValueCountFrequency (%)
2
18.2%
2
18.2%
2
18.2%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
Enclosed Alphanum Sup
ValueCountFrequency (%)
🇺1
50.0%
🇸1
50.0%
IPA Ext
ValueCountFrequency (%)
ə2
100.0%
Misc Technical
ValueCountFrequency (%)
1
100.0%
Hebrew
ValueCountFrequency (%)
ה1
20.0%
פ1
20.0%
ק1
20.0%
ו1
20.0%
ת1
20.0%
Khmer
ValueCountFrequency (%)
2
15.4%
2
15.4%
1
7.7%
1
7.7%
1
7.7%
1
7.7%
1
7.7%
1
7.7%
1
7.7%
1
7.7%
Ethiopic
ValueCountFrequency (%)
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
Other values (3)3
23.1%
Myanmar
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%

Category
Categorical

HIGH CORRELATION

Distinct: 33
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 84.8 KiB
FAMILY             1972
GAME               1144
TOOLS              843
MEDICAL            463
BUSINESS           460
Other values (28)  5958

Length

Max length: 19
Median length: 7
Mean length: 9.024446494
Min length: 4

Characters and Unicode

Total characters: 97825
Distinct characters: 24
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: GAME
2nd row: PRODUCTIVITY
3rd row: FAMILY
4th row: MEDICAL
5th row: COMMUNICATION

Common Values

Value              Count  Frequency (%)
FAMILY             1972   18.2%
GAME               1144   10.6%
TOOLS              843    7.8%
MEDICAL            463    4.3%
BUSINESS           460    4.2%
PRODUCTIVITY       424    3.9%
PERSONALIZATION    392    3.6%
COMMUNICATION      387    3.6%
SPORTS             384    3.5%
LIFESTYLE          382    3.5%
Other values (23)  3989   36.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
family1972
18.2%
game1144
 
10.6%
tools843
 
7.8%
medical463
 
4.3%
business460
 
4.2%
productivity424
 
3.9%
personalization392
 
3.6%
communication387
 
3.6%
sports384
 
3.5%
lifestyle382
 
3.5%
Other values (23)3989
36.8%

Most occurring characters

ValueCountFrequency (%)
A10424
 
10.7%
I8783
 
9.0%
E7958
 
8.1%
N7335
 
7.5%
O7125
 
7.3%
S6558
 
6.7%
L6189
 
6.3%
T5894
 
6.0%
M5155
 
5.3%
_3575
 
3.7%
Other values (14)28829
29.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter94250
96.3%
Connector Punctuation3575
 
3.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A10424
11.1%
I8783
 
9.3%
E7958
 
8.4%
N7335
 
7.8%
O7125
 
7.6%
S6558
 
7.0%
L6189
 
6.6%
T5894
 
6.3%
M5155
 
5.5%
D3556
 
3.8%
Other values (13)25273
26.8%
Connector Punctuation
ValueCountFrequency (%)
_3575
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin94250
96.3%
Common3575
 
3.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
A10424
11.1%
I8783
 
9.3%
E7958
 
8.4%
N7335
 
7.8%
O7125
 
7.6%
S6558
 
7.0%
L6189
 
6.6%
T5894
 
6.3%
M5155
 
5.5%
D3556
 
3.8%
Other values (13)25273
26.8%
Common
ValueCountFrequency (%)
_3575
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII97825
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A10424
 
10.7%
I8783
 
9.0%
E7958
 
8.1%
N7335
 
7.5%
O7125
 
7.3%
S6558
 
6.7%
L6189
 
6.3%
T5894
 
6.0%
M5155
 
5.3%
_3575
 
3.7%
Other values (14)28829
29.5%

Rating
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 39
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.206476015
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 84.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3.3
Q1: 4.1
Median: 4.3
Q3: 4.5
95-th percentile: 4.8
Maximum: 5
Range: 4
Interquartile range (IQR): 0.4

Descriptive statistics

Standard deviation: 0.4803419861
Coefficient of variation (CV): 0.1141910674
Kurtosis: 7.295937667
Mean: 4.206476015
Median Absolute Deviation (MAD): 0.2
Skewness: -2.062518556
Sum: 45598.2
Variance: 0.2307284236
Monotonicity: Increasing
Histogram with fixed size bins (bins=39)
Value              Count  Frequency (%)
4.3                2550   23.5%
4.4                1109   10.2%
4.5                1038   9.6%
4.2                952    8.8%
4.6                823    7.6%
4.1                708    6.5%
4                  568    5.2%
4.7                499    4.6%
3.9                386    3.6%
3.8                303    2.8%
Other values (29)  1904   17.6%

Value  Count  Frequency (%)
1      16     0.1%
1.2    1      < 0.1%
1.4    3      < 0.1%
1.5    3      < 0.1%
1.6    4      < 0.1%
1.7    8      0.1%
1.8    8      0.1%
1.9    13     0.1%
2      12     0.1%
2.1    8      0.1%

Value  Count  Frequency (%)
5      274    2.5%
4.9    87     0.8%
4.8    234    2.2%
4.7    499    4.6%
4.6    823    7.6%
4.5    1038   9.6%
4.4    1109   10.2%
4.3    2550   23.5%
4.2    952    8.8%
4.1    708    6.5%

Reviews
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 6001
Distinct (%): 55.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 444152.896
Minimum: 0
Maximum: 78158306
Zeros: 596
Zeros (%): 5.5%
Negative: 0
Negative (%): 0.0%
Memory size: 84.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 38
Median: 2094
Q3: 54775.5
95-th percentile: 1462042.65
Maximum: 78158306
Range: 78158306
Interquartile range (IQR): 54737.5

Descriptive statistics

Standard deviation: 2927760.604
Coefficient of variation (CV): 6.591785464
Kurtosis: 341.0603558
Mean: 444152.896
Median Absolute Deviation (MAD): 2094
Skewness: 16.44958434
Sum: 4814617393
Variance: 8.571782154 × 10¹²
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
0                    596    5.5%
1                    272    2.5%
2                    214    2.0%
3                    175    1.6%
4                    137    1.3%
5                    108    1.0%
6                    97     0.9%
7                    90     0.8%
8                    74     0.7%
9                    65     0.6%
Other values (5991)  9012   83.1%

Value  Count  Frequency (%)
0      596    5.5%
1      272    2.5%
2      214    2.0%
3      175    1.6%
4      137    1.3%
5      108    1.0%
6      97     0.9%
7      90     0.8%
8      74     0.7%
9      65     0.6%

Value     Count  Frequency (%)
78158306  1      < 0.1%
78128208  1      < 0.1%
69119316  2      < 0.1%
69109672  1      < 0.1%
66577446  1      < 0.1%
66577313  2      < 0.1%
66509917  1      < 0.1%
56646578  1      < 0.1%
56642847  2      < 0.1%
44893888  1      < 0.1%

Size
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION

Distinct: 460
Distinct (%): 4.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 21.51550853
Minimum: 0.0085
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 84.8 KiB

Quantile statistics

Minimum: 0.0085
5-th percentile: 1.6
Q1: 5.9
Median: 18
Q3: 26
95-th percentile: 69
Maximum: 100
Range: 99.9915
Interquartile range (IQR): 20.1

Descriptive statistics

Standard deviation: 20.74749471
Coefficient of variation (CV): 0.9643041753
Kurtosis: 2.836350148
Mean: 21.51550853
Median Absolute Deviation (MAD): 11
Skewness: 1.695678367
Sum: 233228.1125
Variance: 430.4585368
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
21.51               1695   15.6%
11                  198    1.8%
12                  196    1.8%
14                  194    1.8%
13                  191    1.8%
15                  184    1.7%
17                  160    1.5%
19                  154    1.4%
16                  149    1.4%
26                  149    1.4%
Other values (450)  7570   69.8%

Value   Count  Frequency (%)
0.0085  1      < 0.1%
0.011   1      < 0.1%
0.014   1      < 0.1%
0.017   2      < 0.1%
0.018   2      < 0.1%
0.02    1      < 0.1%
0.023   1      < 0.1%
0.024   1      < 0.1%
0.025   1      < 0.1%
0.026   2      < 0.1%

Value  Count  Frequency (%)
100    16     0.1%
99     39     0.4%
98     16     0.1%
97     20     0.2%
96     26     0.2%
95     18     0.2%
94     17     0.2%
93     16     0.1%
92     15     0.1%
91     22     0.2%

Installs
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 20
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15464338.88
Minimum: 0
Maximum: 1000000000
Zeros: 15
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 84.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 10
Q1: 1000
Median: 100000
Q3: 5000000
95-th percentile: 50000000
Maximum: 1000000000
Range: 1000000000
Interquartile range (IQR): 4999000

Descriptive statistics

Standard deviation: 85029361.4
Coefficient of variation (CV): 5.498415551
Kurtosis: 100.280006
Mean: 15464338.88
Median Absolute Deviation (MAD): 99990
Skewness: 9.572066755
Sum: 1.676334335 × 10¹¹
Variance: 7.229992299 × 10¹⁵
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=20)
Value              Count  Frequency (%)
1000000            1579   14.6%
10000000           1252   11.5%
100000             1169   10.8%
10000              1054   9.7%
1000               907    8.4%
5000000            752    6.9%
100                719    6.6%
500000             539    5.0%
50000              479    4.4%
5000               477    4.4%
Other values (10)  1913   17.6%

Value  Count  Frequency (%)
0      15     0.1%
1      67     0.6%
5      82     0.8%
10     386    3.6%
50     205    1.9%
100    719    6.6%
500    330    3.0%
1000   907    8.4%
5000   477    4.4%
10000  1054   9.7%

Value       Count  Frequency (%)
1000000000  58     0.5%
500000000   72     0.7%
100000000   409    3.8%
50000000    289    2.7%
10000000    1252   11.5%
5000000     752    6.9%
1000000     1579   14.6%
500000      539    5.0%
100000      1169   10.8%
50000       479    4.4%

Type
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 84.8 KiB
Free  10040
Paid  800

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 43360
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Free
2nd row: Free
3rd row: Free
4th row: Free
5th row: Free

Common Values

Value  Count  Frequency (%)
Free   10040  92.6%
Paid   800    7.4%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
free10040
92.6%
paid800
 
7.4%

Most occurring characters

ValueCountFrequency (%)
e20080
46.3%
F10040
23.2%
r10040
23.2%
P800
 
1.8%
a800
 
1.8%
i800
 
1.8%
d800
 
1.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter32520
75.0%
Uppercase Letter10840
 
25.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e20080
61.7%
r10040
30.9%
a800
 
2.5%
i800
 
2.5%
d800
 
2.5%
Uppercase Letter
ValueCountFrequency (%)
F10040
92.6%
P800
 
7.4%

Most occurring scripts

ValueCountFrequency (%)
Latin43360
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e20080
46.3%
F10040
23.2%
r10040
23.2%
P800
 
1.8%
a800
 
1.8%
i800
 
1.8%
d800
 
1.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII43360
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e20080
46.3%
F10040
23.2%
r10040
23.2%
P800
 
1.8%
a800
 
1.8%
i800
 
1.8%
d800
 
1.8%

Price
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
SKEWED
ZEROS

Distinct: 92
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.027368081
Minimum: 0
Maximum: 400
Zeros: 10040
Zeros (%): 92.6%
Negative: 0
Negative (%): 0.0%
Memory size: 84.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1.99
Maximum: 400
Range: 400
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 15.94970347
Coefficient of variation (CV): 15.52481896
Kurtosis: 578.1429983
Mean: 1.027368081
Median Absolute Deviation (MAD): 0
Skewness: 23.70739238
Sum: 11136.67
Variance: 254.3930408
Monotonicity: Not monotonic
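
Skewness (γ1) is the third standardized moment of a distribution. The sketch below implements the uncorrected population form, m3 / m2^1.5. This is an assumption for illustration: the figure in the report may come from a bias-adjusted sample estimator, so results on small samples can differ slightly. The toy data are made up and are not taken from the dataset.

```python
def skewness(values):
    """Population skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Mostly-zero prices with one expensive outlier, mimicking the shape of Price.
toy_prices = [0.0] * 9 + [10.0]
# A symmetric sample for comparison.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]

print(skewness(toy_prices), skewness(symmetric))
```

A column dominated by zeros with a few large values, as Price is here (92.6% zeros, maximum 400), produces a strongly positive skewness, while a symmetric sample gives a value at zero.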
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
0                  10040  92.6%
0.99               148    1.4%
2.99               129    1.2%
1.99               73     0.7%
4.99               72     0.7%
3.99               63     0.6%
1.49               46     0.4%
5.99               30     0.3%
2.49               26     0.2%
9.99               21     0.2%
Other values (82)  192    1.8%

Value  Count  Frequency (%)
0      10040  92.6%
0.99   148    1.4%
1      3      < 0.1%
1.04   1      < 0.1%
1.2    1      < 0.1%
1.26   1      < 0.1%
1.29   1      < 0.1%
1.49   46     0.4%
1.5    1      < 0.1%
1.59   1      < 0.1%

Value   Count  Frequency (%)
400     1      < 0.1%
399.99  12     0.1%
394.99  1      < 0.1%
389.99  1      < 0.1%
379.99  1      < 0.1%
299.99  1      < 0.1%
200     1      < 0.1%
154.99  1      < 0.1%
109.99  1      < 0.1%
89.99   1      < 0.1%

Content Rating
Categorical

HIGH CORRELATION

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 84.8 KiB
Everyone         8714
Teen             1208
Mature 17+       499
Everyone 10+     414
Adults only 18+  3

Length

Max length: 15
Median length: 8
Mean length: 7.800830258
Min length: 4

Characters and Unicode

Total characters: 84561
Distinct characters: 23
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Everyone
2nd row: Everyone
3rd row: Everyone
4th row: Everyone
5th row: Everyone

Common Values

Value            Count  Frequency (%)
Everyone         8714   80.4%
Teen             1208   11.1%
Mature 17+       499    4.6%
Everyone 10+     414    3.8%
Adults only 18+  3      < 0.1%
Unrated          2      < 0.1%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
everyone9128
77.6%
teen1208
 
10.3%
mature499
 
4.2%
17499
 
4.2%
10414
 
3.5%
adults3
 
< 0.1%
only3
 
< 0.1%
183
 
< 0.1%
unrated2
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
e21173
25.0%
n10341
12.2%
r9629
11.4%
y9131
10.8%
o9131
10.8%
E9128
10.8%
v9128
10.8%
T1208
 
1.4%
919
 
1.1%
1916
 
1.1%
Other values (13)3857
 
4.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter70054
82.8%
Uppercase Letter10840
 
12.8%
Decimal Number1832
 
2.2%
Space Separator919
 
1.1%
Math Symbol916
 
1.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e21173
30.2%
n10341
14.8%
r9629
13.7%
y9131
13.0%
o9131
13.0%
v9128
13.0%
t504
 
0.7%
u502
 
0.7%
a501
 
0.7%
l6
 
< 0.1%
Other values (2)8
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
E9128
84.2%
T1208
 
11.1%
M499
 
4.6%
A3
 
< 0.1%
U2
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1916
50.0%
7499
27.2%
0414
22.6%
83
 
0.2%
Space Separator
ValueCountFrequency (%)
919
100.0%
Math Symbol
ValueCountFrequency (%)
+916
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin80894
95.7%
Common3667
 
4.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e21173
26.2%
n10341
12.8%
r9629
11.9%
y9131
11.3%
o9131
11.3%
E9128
11.3%
v9128
11.3%
T1208
 
1.5%
t504
 
0.6%
u502
 
0.6%
Other values (7)1019
 
1.3%
Common
ValueCountFrequency (%)
919
25.1%
1916
25.0%
+916
25.0%
7499
13.6%
0414
11.3%
83
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII84561
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e21173
25.0%
n10341
12.2%
r9629
11.4%
y9131
10.8%
o9131
10.8%
E9128
10.8%
v9128
10.8%
T1208
 
1.4%
919
 
1.1%
1916
 
1.1%
Other values (13)3857
 
4.6%

Genres
Categorical

HIGH CARDINALITY

Distinct: 119
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 84.8 KiB
Tools               842
Entertainment       623
Education           549
Medical             463
Business            460
Other values (114)  7903

Length

Max length: 37
Median length: 9
Mean length: 10.42204797
Min length: 4

Characters and Unicode

Total characters: 112975
Distinct characters: 43
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 23
Unique (%): 0.2%

Sample

1st row: Card
2nd row: Productivity
3rd row: Education
4th row: Medical
5th row: Communication

Common Values

Value               Count  Frequency (%)
Tools               842    7.8%
Entertainment       623    5.7%
Education           549    5.1%
Medical             463    4.3%
Business            460    4.2%
Productivity        424    3.9%
Sports              398    3.7%
Personalization     392    3.6%
Communication       387    3.6%
Lifestyle           381    3.5%
Other values (109)  5921   54.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
2073
 
13.4%
tools842
 
5.5%
entertainment623
 
4.0%
education549
 
3.6%
medical463
 
3.0%
business460
 
3.0%
productivity424
 
2.7%
sports398
 
2.6%
personalization392
 
2.5%
communication387
 
2.5%
Other values (125)8836
57.2%

Most occurring characters

ValueCountFrequency (%)
i9816
 
8.7%
o9089
 
8.0%
e8939
 
7.9%
n8905
 
7.9%
t8623
 
7.6%
a8618
 
7.6%
s6282
 
5.6%
4607
 
4.1%
l4601
 
4.1%
r4532
 
4.0%
Other values (33)38963
34.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter91925
81.4%
Uppercase Letter13872
 
12.3%
Space Separator4607
 
4.1%
Other Punctuation2571
 
2.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i9816
10.7%
o9089
9.9%
e8939
9.7%
n8905
9.7%
t8623
9.4%
a8618
9.4%
s6282
 
6.8%
l4601
 
5.0%
r4532
 
4.9%
c4395
 
4.8%
Other values (13)18125
19.7%
Uppercase Letter
ValueCountFrequency (%)
P1859
13.4%
E1782
12.8%
S1286
9.3%
A1141
8.2%
T1140
8.2%
M956
 
6.9%
B880
 
6.3%
C845
 
6.1%
F836
 
6.0%
L726
 
5.2%
Other values (7)2421
17.5%
Other Punctuation
ValueCountFrequency (%)
&2073
80.6%
;498
 
19.4%
Space Separator
ValueCountFrequency (%)
4607
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin105797
93.6%
Common7178
 
6.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
i9816
 
9.3%
o9089
 
8.6%
e8939
 
8.4%
n8905
 
8.4%
t8623
 
8.2%
a8618
 
8.1%
s6282
 
5.9%
l4601
 
4.3%
r4532
 
4.3%
c4395
 
4.2%
Other values (30)31997
30.2%
Common
ValueCountFrequency (%)
4607
64.2%
&2073
28.9%
;498
 
6.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII112975
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i9816
 
8.7%
o9089
 
8.0%
e8939
 
7.9%
n8905
 
7.9%
t8623
 
7.6%
a8618
 
7.6%
s6282
 
5.6%
4607
 
4.1%
l4601
 
4.1%
r4532
 
4.0%
Other values (33)38963
34.5%

Last Updated
Categorical

HIGH CARDINALITY

Distinct: 1377
Distinct (%): 12.7%
Missing: 0
Missing (%): 0.0%
Memory size: 84.8 KiB
August 3, 2018       326
August 2, 2018       304
July 31, 2018        294
August 1, 2018       285
July 30, 2018        211
Other values (1372)  9420

Length

Max length: 18
Median length: 13
Mean length: 13.87629151
Min length: 11

Characters and Unicode

Total characters: 150419
Distinct characters: 38
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 423
Unique (%): 3.9%

Sample

1st row: May 21, 2018
2nd row: March 15, 2018
3rd row: July 27, 2018
4th row: May 1, 2017
5th row: October 6, 2017
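
Last Updated is profiled as Categorical because the dates are stored as strings such as "May 21, 2018". The sketch below parses them into date objects with the Python standard library. The format string `%B %d, %Y` is inferred from the sample rows and assumes English month names; the helper name `parse_last_updated` is illustrative.

```python
from datetime import date, datetime

# Sample rows shown in this report.
samples = [
    "May 21, 2018",
    "March 15, 2018",
    "July 27, 2018",
    "May 1, 2017",
    "October 6, 2017",
]


def parse_last_updated(text: str) -> date:
    """Parse a 'Month D, YYYY' string into a date."""
    return datetime.strptime(text, "%B %d, %Y").date()


parsed = [parse_last_updated(s) for s in samples]
print(min(parsed), max(parsed))
```

Once the column holds real dates, it can be profiled as a date variable with a meaningful minimum, maximum, and range rather than as a high-cardinality category.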

Common Values

Value                Count  Frequency (%)
August 3, 2018       326    3.0%
August 2, 2018       304    2.8%
July 31, 2018        294    2.7%
August 1, 2018       285    2.6%
July 30, 2018        211    1.9%
July 25, 2018        164    1.5%
July 26, 2018        161    1.5%
August 6, 2018       158    1.5%
July 27, 2018        151    1.4%
July 24, 2018        148    1.4%
Other values (1367)  8638   79.7%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
2018 | 7349 | 22.6%
july | 3163 | 9.7%
2017 | 1867 | 5.7%
august | 1594 | 4.9%
june | 1273 | 3.9%
may | 978 | 3.0%
2016 | 804 | 2.5%
march | 667 | 2.1%
april | 616 | 1.9%
3 | 603 | 1.9%
Other values (42) | 13606 | 41.8%

Most occurring characters

ValueCountFrequency (%)
21680
14.4%
215426
 
10.3%
115321
 
10.2%
011819
 
7.9%
,10840
 
7.2%
u8648
 
5.7%
88202
 
5.5%
e5198
 
3.5%
y5165
 
3.4%
J4927
 
3.3%
Other values (28)43193
28.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number61425
40.8%
Lowercase Letter45634
30.3%
Space Separator21680
 
14.4%
Uppercase Letter10840
 
7.2%
Other Punctuation10840
 
7.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u8648
19.0%
e5198
11.4%
y5165
11.3%
r4365
9.6%
l3779
8.3%
a3160
 
6.9%
t2306
 
5.1%
b2058
 
4.5%
n1764
 
3.9%
g1594
 
3.5%
Other values (8)7597
16.6%
Decimal Number
ValueCountFrequency (%)
215426
25.1%
115321
24.9%
011819
19.2%
88202
13.4%
72838
 
4.6%
32215
 
3.6%
61996
 
3.2%
51541
 
2.5%
41131
 
1.8%
9936
 
1.5%
Uppercase Letter
ValueCountFrequency (%)
J4927
45.5%
A2210
20.4%
M1645
 
15.2%
F533
 
4.9%
D426
 
3.9%
O398
 
3.7%
N387
 
3.6%
S314
 
2.9%
Space Separator
ValueCountFrequency (%)
21680
100.0%
Other Punctuation
ValueCountFrequency (%)
,10840
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common93945
62.5%
Latin56474
37.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
u8648
15.3%
e5198
 
9.2%
y5165
 
9.1%
J4927
 
8.7%
r4365
 
7.7%
l3779
 
6.7%
a3160
 
5.6%
t2306
 
4.1%
A2210
 
3.9%
b2058
 
3.6%
Other values (16)14658
26.0%
Common
ValueCountFrequency (%)
21680
23.1%
215426
16.4%
115321
16.3%
011819
12.6%
,10840
11.5%
88202
 
8.7%
72838
 
3.0%
32215
 
2.4%
61996
 
2.1%
51541
 
1.6%
Other values (2)2067
 
2.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII150419
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
21680
14.4%
215426
 
10.3%
115321
 
10.2%
011819
 
7.9%
,10840
 
7.2%
u8648
 
5.7%
88202
 
5.5%
e5198
 
3.5%
y5165
 
3.4%
J4927
 
3.3%
Other values (28)43193
28.7%

Current Ver
Categorical

HIGH CARDINALITY

Distinct: 2831
Distinct (%): 26.1%
Missing: 0
Missing (%): 0.0%
Memory size: 84.8 KiB

Varies with device: 1467
1.0: 809
1.1: 264
1.2: 178
2.0: 151
Other values (2826): 7971

Length

Max length: 50
Median length: 5
Mean length: 6.881549815
Min length: 1

Characters and Unicode

Total characters: 74596
Distinct characters: 79
Distinct categories: 12
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1810
Unique (%): 16.7%

Sample

1st row: 6.2-sayc
2nd row: 41.9
3rd row: 5.33.3669
4th row: 300000.0.81
5th row: 1.0

Common Values

Value | Count | Frequency (%)
Varies with device | 1467 | 13.5%
1.0 | 809 | 7.5%
1.1 | 264 | 2.4%
1.2 | 178 | 1.6%
2.0 | 151 | 1.4%
1.3 | 145 | 1.3%
1.0.0 | 136 | 1.3%
1.0.1 | 119 | 1.1%
1.4 | 88 | 0.8%
1.5 | 81 | 0.7%
Other values (2821) | 7402 | 68.3%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
device | 1467 | 10.5%
varies | 1467 | 10.5%
with | 1467 | 10.5%
1.0 | 814 | 5.8%
1.1 | 266 | 1.9%
1.2 | 178 | 1.3%
2.0 | 154 | 1.1%
1.3 | 147 | 1.1%
1.0.0 | 136 | 1.0%
1.0.1 | 120 | 0.9%
Other values (2867) | 7734 | 55.4%

Most occurring characters

ValueCountFrequency (%)
.15513
20.8%
18916
12.0%
05818
 
7.8%
e4532
 
6.1%
i4488
 
6.0%
24316
 
5.8%
3110
 
4.2%
32771
 
3.7%
42164
 
2.9%
51729
 
2.3%
Other values (69)21239
28.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number30681
41.1%
Lowercase Letter23223
31.1%
Other Punctuation15526
20.8%
Space Separator3110
 
4.2%
Uppercase Letter1770
 
2.4%
Dash Punctuation133
 
0.2%
Connector Punctuation61
 
0.1%
Open Punctuation39
 
0.1%
Close Punctuation39
 
0.1%
Math Symbol12
 
< 0.1%
Other values (2)2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e4532
19.5%
i4488
19.3%
a1587
 
6.8%
r1573
 
6.8%
d1559
 
6.7%
v1525
 
6.6%
s1514
 
6.5%
t1514
 
6.5%
c1508
 
6.5%
w1485
 
6.4%
Other values (16)1938
8.3%
Uppercase Letter
ValueCountFrequency (%)
V1487
84.0%
A31
 
1.8%
C24
 
1.4%
R24
 
1.4%
B21
 
1.2%
P21
 
1.2%
E18
 
1.0%
S18
 
1.0%
T16
 
0.9%
D15
 
0.8%
Other values (15)95
 
5.4%
Decimal Number
ValueCountFrequency (%)
18916
29.1%
05818
19.0%
24316
14.1%
32771
 
9.0%
42164
 
7.1%
51729
 
5.6%
61437
 
4.7%
71339
 
4.4%
81169
 
3.8%
91022
 
3.3%
Other Punctuation
ValueCountFrequency (%)
.15513
99.9%
/4
 
< 0.1%
,4
 
< 0.1%
:3
 
< 0.1%
;1
 
< 0.1%
'1
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
-132
99.2%
1
 
0.8%
Open Punctuation
ValueCountFrequency (%)
(38
97.4%
[1
 
2.6%
Close Punctuation
ValueCountFrequency (%)
)38
97.4%
]1
 
2.6%
Math Symbol
ValueCountFrequency (%)
+9
75.0%
|3
 
25.0%
Space Separator
ValueCountFrequency (%)
3110
100.0%
Connector Punctuation
ValueCountFrequency (%)
_61
100.0%
Other Symbol
ValueCountFrequency (%)
®1
100.0%
Other Number
ValueCountFrequency (%)
³1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common49603
66.5%
Latin24993
33.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e4532
18.1%
i4488
18.0%
a1587
 
6.3%
r1573
 
6.3%
d1559
 
6.2%
v1525
 
6.1%
s1514
 
6.1%
t1514
 
6.1%
c1508
 
6.0%
V1487
 
5.9%
Other values (41)3706
14.8%
Common
ValueCountFrequency (%)
.15513
31.3%
18916
18.0%
05818
 
11.7%
24316
 
8.7%
3110
 
6.3%
32771
 
5.6%
42164
 
4.4%
51729
 
3.5%
61437
 
2.9%
71339
 
2.7%
Other values (18)2490
 
5.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII74592
> 99.9%
Latin 1 Sup3
 
< 0.1%
Punctuation1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.15513
20.8%
18916
12.0%
05818
 
7.8%
e4532
 
6.1%
i4488
 
6.0%
24316
 
5.8%
3110
 
4.2%
32771
 
3.7%
42164
 
2.9%
51729
 
2.3%
Other values (65)21235
28.5%
Punctuation
ValueCountFrequency (%)
1
100.0%
Latin 1 Sup
ValueCountFrequency (%)
®1
33.3%
Ã1
33.3%
³1
33.3%

Android Ver
Categorical

HIGH CORRELATION

Distinct: 33
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 84.8 KiB

4.1: 2453
4.0.3: 1501
4.0: 1375
Varies with device: 1362
4.4: 980
Other values (28): 3169

Length

Max length: 18
Median length: 4
Mean length: 6.095848708
Min length: 4

Characters and Unicode

Total characters: 66079
Distinct characters: 25
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): < 0.1%

Sample

1st row: 4.0
2nd row: 4.1
3rd row: 5.0
4th row: 4.0.3
5th row: 4.1

Common Values

Value | Count | Frequency (%)
4.1 | 2453 | 22.6%
4.0.3 | 1501 | 13.8%
4.0 | 1375 | 12.7%
Varies with device | 1362 | 12.6%
4.4 | 980 | 9.0%
2.3 | 652 | 6.0%
5.0 | 601 | 5.5%
4.2 | 394 | 3.6%
2.3.3 | 281 | 2.6%
2.2 | 244 | 2.3%
Other values (23) | 997 | 9.2%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
4.1 | 2454 | 18.1%
4.0.3 | 1503 | 11.1%
4.0 | 1375 | 10.1%
varies | 1362 | 10.0%
with | 1362 | 10.0%
device | 1362 | 10.0%
4.4 | 980 | 7.2%
2.3 | 652 | 4.8%
5.0 | 605 | 4.5%
4.2 | 394 | 2.9%
Other values (20) | 1533 | 11.3%

Most occurring characters

ValueCountFrequency (%)
12211
18.5%
.11284
17.1%
47953
12.0%
i4086
 
6.2%
e4086
 
6.2%
03877
 
5.9%
33247
 
4.9%
12782
 
4.2%
22026
 
3.1%
V1362
 
2.1%
Other values (15)13165
19.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number20771
31.4%
Lowercase Letter20430
30.9%
Space Separator12211
18.5%
Other Punctuation11284
17.1%
Uppercase Letter1374
 
2.1%
Dash Punctuation9
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i4086
20.0%
e4086
20.0%
a1362
 
6.7%
r1362
 
6.7%
s1362
 
6.7%
w1362
 
6.7%
t1362
 
6.7%
h1362
 
6.7%
d1362
 
6.7%
v1362
 
6.7%
Decimal Number
ValueCountFrequency (%)
47953
38.3%
03877
18.7%
33247
15.6%
12782
 
13.4%
22026
 
9.8%
5649
 
3.1%
6177
 
0.9%
752
 
0.3%
88
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
V1362
99.1%
W12
 
0.9%
Other Punctuation
ValueCountFrequency (%)
.11284
100.0%
Space Separator
ValueCountFrequency (%)
12211
100.0%
Dash Punctuation
ValueCountFrequency (%)
-9
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common44275
67.0%
Latin21804
33.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i4086
18.7%
e4086
18.7%
V1362
 
6.2%
a1362
 
6.2%
r1362
 
6.2%
s1362
 
6.2%
w1362
 
6.2%
t1362
 
6.2%
h1362
 
6.2%
d1362
 
6.2%
Other values (3)2736
12.5%
Common
ValueCountFrequency (%)
12211
27.6%
.11284
25.5%
47953
18.0%
03877
 
8.8%
33247
 
7.3%
12782
 
6.3%
22026
 
4.6%
5649
 
1.5%
6177
 
0.4%
752
 
0.1%
Other values (2)17
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII66079
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
12211
18.5%
.11284
17.1%
47953
12.0%
i4086
 
6.2%
e4086
 
6.2%
03877
 
5.9%
33247
 
4.9%
12782
 
4.2%
22026
 
3.1%
V1362
 
2.1%
Other values (15)13165
19.9%

Earnings
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
SKEWED
ZEROS

Distinct: 230
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 35851.42489
Minimum: 0
Maximum: 69900000
Zeros: 10050
Zeros (%): 92.7%
Negative: 0
Negative (%): 0.0%
Memory size: 84.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 990
Maximum: 69900000
Range: 69900000
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1065979.51
Coefficient of variation (CV): 29.73325366
Kurtosis: 3598.540288
Mean: 35851.42489
Median Absolute Deviation (MAD): 0
Skewness: 57.6434555
Sum: 388629445.8
Variance: 1.136312316 × 10^12
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 10050 | 92.7%
29900 | 29 | 0.3%
990 | 26 | 0.2%
299000 | 25 | 0.2%
9.9 | 24 | 0.2%
99 | 22 | 0.2%
2990 | 17 | 0.2%
499000 | 16 | 0.1%
9900 | 16 | 0.1%
19900 | 15 | 0.1%
Other values (220) | 600 | 5.5%

Value | Count | Frequency (%)
0 | 10050 | 92.7%
0.99 | 8 | 0.1%
1.49 | 4 | < 0.1%
1.99 | 1 | < 0.1%
2.49 | 1 | < 0.1%
2.99 | 2 | < 0.1%
3.99 | 1 | < 0.1%
4.95 | 5 | < 0.1%
5.99 | 2 | < 0.1%
7.45 | 1 | < 0.1%

Value | Count | Frequency (%)
69900000 | 2 | < 0.1%
39999000 | 1 | < 0.1%
19999500 | 1 | < 0.1%
9900000 | 1 | < 0.1%
6990000 | 1 | < 0.1%
5990000 | 4 | < 0.1%
4990000 | 1 | < 0.1%
4000000 | 1 | < 0.1%
3999900 | 2 | < 0.1%
3899900 | 1 | < 0.1%

Reviews_log
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 6001
Distinct (%): 55.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.211636201
Minimum: 0
Maximum: 7.892975143
Zeros: 596
Zeros (%): 5.5%
Negative: 0
Negative (%): 0.0%
Memory size: 84.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1.591064607
Median: 3.321184027
Q3: 4.738594267
95-th percentile: 6.16496011
Maximum: 7.892975143
Range: 7.892975143
Interquartile range (IQR): 3.14752966

Descriptive statistics

Standard deviation: 1.902933741
Coefficient of variation (CV): 0.5925122341
Kurtosis: -1.038113864
Mean: 3.211636201
Median Absolute Deviation (MAD): 1.552049353
Skewness: -0.0156730911
Sum: 34814.13642
Variance: 3.621156821
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 596 | 5.5%
0.3010299957 | 272 | 2.5%
0.4771212547 | 214 | 2.0%
0.6020599913 | 175 | 1.6%
0.6989700043 | 137 | 1.3%
0.7781512504 | 108 | 1.0%
0.84509804 | 97 | 0.9%
0.903089987 | 90 | 0.8%
0.9542425094 | 74 | 0.7%
1 | 65 | 0.6%
Other values (5991) | 9012 | 83.1%

Value | Count | Frequency (%)
0 | 596 | 5.5%
0.3010299957 | 272 | 2.5%
0.4771212547 | 214 | 2.0%
0.6020599913 | 175 | 1.6%
0.6989700043 | 137 | 1.3%
0.7781512504 | 108 | 1.0%
0.84509804 | 97 | 0.9%
0.903089987 | 90 | 0.8%
0.9542425094 | 74 | 0.7%
1 | 65 | 0.6%

Value | Count | Frequency (%)
7.892975143 | 1 | < 0.1%
7.892807869 | 1 | < 0.1%
7.839599438 | 2 | < 0.1%
7.839538838 | 1 | < 0.1%
7.823327138 | 1 | < 0.1%
7.82332627 | 2 | < 0.1%
7.822886412 | 1 | < 0.1%
7.753173687 | 1 | < 0.1%
7.753145082 | 2 | < 0.1%
7.652187228 | 1 | < 0.1%

Installs_log
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 20
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.885467305
Minimum: 0
Maximum: 9
Zeros: 15
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 84.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1.041392685
Q1: 3.000434077
Median: 5.000004343
Q3: 6.698970091
95-th percentile: 7.698970013
Maximum: 9
Range: 9
Interquartile range (IQR): 3.698536014

Descriptive statistics

Standard deviation: 1.966001418
Coefficient of variation (CV): 0.4024182939
Kurtosis: -0.7697792436
Mean: 4.885467305
Median Absolute Deviation (MAD): 1.698965748
Skewness: -0.3037457808
Sum: 52958.46559
Variance: 3.865161576
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=20)
Value | Count | Frequency (%)
6.000000434 | 1579 | 14.6%
7.000000043 | 1252 | 11.5%
5.000004343 | 1169 | 10.8%
4.000043427 | 1054 | 9.7%
3.000434077 | 907 | 8.4%
6.698970091 | 752 | 6.9%
2.004321374 | 719 | 6.6%
5.698970873 | 539 | 5.0%
4.69897869 | 479 | 4.4%
3.699056855 | 477 | 4.4%
Other values (10) | 1913 | 17.6%

Value | Count | Frequency (%)
0 | 15 | 0.1%
0.3010299957 | 67 | 0.6%
0.7781512504 | 82 | 0.8%
1.041392685 | 386 | 3.6%
1.707570176 | 205 | 1.9%
2.004321374 | 719 | 6.6%
2.699837726 | 330 | 3.0%
3.000434077 | 907 | 8.4%
3.699056855 | 477 | 4.4%
4.000043427 | 1054 | 9.7%

Value | Count | Frequency (%)
9 | 58 | 0.5%
8.698970005 | 72 | 0.7%
8.000000004 | 409 | 3.8%
7.698970013 | 289 | 2.7%
7.000000043 | 1252 | 11.5%
6.698970091 | 752 | 6.9%
6.000000434 | 1579 | 14.6%
5.698970873 | 539 | 5.0%
5.000004343 | 1169 | 10.8%
4.69897869 | 479 | 4.4%

Interactions

[Figures: 81 pairwise interaction plots; images not reproduced]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear relationship the slope (the angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
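
The calculation described above can be sketched in plain Python as follows. The function name `pearson_r` and the toy numbers are illustrative assumptions, not values or code from this report; the sketch also checks the location-and-scale invariance mentioned earlier.

```python
import math

def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations.

    The 1/n factors cancel, so sums of centred products are used directly.
    Raises ZeroDivisionError if either variable is constant.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up toy data, for illustration only.
a = [0.30, 0.48, 0.78, 1.23, 2.10]
b = [1.7, 2.0, 2.7, 3.0, 4.0]

r = pearson_r(a, b)
# Shifting and (positively) rescaling one variable leaves r unchanged.
r_rescaled = pearson_r([10 * x + 3 for x in a], b)
print(r, r_rescaled)
```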

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
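
A minimal sketch of that rank-based calculation is given below, assuming ties receive the average of the ranks they span (a common convention that the text above does not specify). The helper names and the toy data are hypothetical.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / math.sqrt(sum((x - mx) ** 2 for x in xs) *
                           sum((y - my) ** 2 for y in ys))

def spearman_rho(xs, ys):
    """Pearson's r applied to the rank variables of X and Y."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# A monotonic but nonlinear relationship: rho is 1, Pearson's r is not.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```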

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
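
The pair-counting definition above corresponds to the variant usually called tau-a, which can be sketched as follows. The function name and toy inputs are illustrative; statistical libraries often use tie-adjusted variants such as tau-b instead, so results may differ when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
        # sign == 0 means a tie in x or y; it counts toward neither group
    return (concordant - discordant) / len(pairs)

tau = kendall_tau_a([1, 2, 3], [1, 3, 2])  # 2 concordant, 1 discordant pair
print(tau)
```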

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
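
For illustration, a bias-corrected Cramér's V can be sketched in plain Python as below. This assumes the commonly cited form of Bergsma's correction (φ² reduced by (k-1)(r-1)/(n-1), with correspondingly adjusted row and column counts); the function name and toy data are hypothetical, and the report's own implementation may differ in detail.

```python
import math
from collections import Counter

def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equal-length categorical samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for a, ra in row_totals.items():
        for b, cb in col_totals.items():
            expected = ra * cb / n
            chi2 += (observed[(a, b)] - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Assumed correction terms (Bergsma, 2013).
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0

# Toy data: a perfectly associated pair and an independent pair.
perfect = cramers_v_corrected(["Free", "Paid"] * 50, ["0", "0.99"] * 50)
independent = cramers_v_corrected(["a", "a", "b", "b"] * 25,
                                  ["x", "y", "x", "y"] * 25)
print(perfect, independent)
```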

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

(index) | df_index | App | Category | Rating | Reviews | Size | Installs | Type | Price | Content Rating | Genres | Last Updated | Current Ver | Android Ver | Earnings | Reviews_log | Installs_log
0 | 6319 | BJ Bridge Standard American 2018 | GAME | 1.0 | 1 | 4.9 | 1000.0 | Free | 0.0 | Everyone | Card | May 21, 2018 | 6.2-sayc | 4.0 | 0.0 | 0.301030 | 3.000434
1 | 7383 | Thistletown CI | PRODUCTIVITY | 1.0 | 1 | 6.6 | 100.0 | Free | 0.0 | Everyone | Productivity | March 15, 2018 | 41.9 | 4.1 | 0.0 | 0.301030 | 2.004321
2 | 10324 | FE Mechanical Engineering Prep | FAMILY | 1.0 | 2 | 21.0 | 1000.0 | Free | 0.0 | Everyone | Education | July 27, 2018 | 5.33.3669 | 5.0 | 0.0 | 0.477121 | 3.000434
3 | 5151 | Clarksburg AH | MEDICAL | 1.0 | 1 | 28.0 | 50.0 | Free | 0.0 | Everyone | Medical | May 1, 2017 | 300000.0.81 | 4.0.3 | 0.0 | 0.301030 | 1.707570
4 | 7427 | CJ DVD Rentals | COMMUNICATION | 1.0 | 5 | 13.0 | 100.0 | Free | 0.0 | Everyone | Communication | October 6, 2017 | 1.0 | 4.1 | 0.0 | 0.778151 | 2.004321
5 | 7806 | CR Magazine | BUSINESS | 1.0 | 1 | 7.8 | 100.0 | Free | 0.0 | Everyone | Business | July 23, 2014 | 2.4.2 | 2.3.3 | 0.0 | 0.301030 | 2.004321
6 | 8875 | DT future1 cam | TOOLS | 1.0 | 1 | 24.0 | 50.0 | Free | 0.0 | Everyone | Tools | March 27, 2018 | 3.1 | 2.2 | 0.0 | 0.301030 | 1.707570
7 | 6490 | MbH BM | MEDICAL | 1.0 | 1 | 2.3 | 100.0 | Free | 0.0 | Everyone | Medical | December 14, 2016 | 1.1.3 | 4.3 | 0.0 | 0.301030 | 2.004321
8 | 7926 | Tech CU Card Manager | FINANCE | 1.0 | 2 | 7.2 | 1000.0 | Free | 0.0 | Everyone | Finance | July 25, 2017 | 1.0.1 | 4.0 | 0.0 | 0.477121 | 3.000434
9 | 10591 | Lottery Ticket Checker - Florida Results & Lotto | TOOLS | 1.0 | 3 | 41.0 | 500.0 | Free | 0.0 | Everyone | Tools | December 12, 2017 | 1.0 | 4.2 | 0.0 | 0.602060 | 2.699838

Last rows

(index) | df_index | App | Category | Rating | Reviews | Size | Installs | Type | Price | Content Rating | Genres | Last Updated | Current Ver | Android Ver | Earnings | Reviews_log | Installs_log
10830 | 10349 | Santa Fe Thrive | HEALTH_AND_FITNESS | 5.0 | 2 | 8.3 | 50.0 | Free | 0.00 | Everyone | Health & Fitness | July 9, 2018 | 4.2.2 | 4.1 | 0.0 | 0.477121 | 1.707570
10831 | 6783 | Wifi BT Scanner | FAMILY | 5.0 | 2 | 1.2 | 500.0 | Free | 0.00 | Everyone | Education | November 3, 2016 | 1.0 | 4.0.3 | 0.0 | 0.477121 | 2.699838
10832 | 10629 | Florida Wildflowers | FAMILY | 5.0 | 5 | 69.0 | 1000.0 | Free | 0.00 | Everyone | Education | July 10, 2017 | 1.5 | 4.1 | 0.0 | 0.778151 | 3.000434
10833 | 1030 | Prosperity | EVENTS | 5.0 | 16 | 2.3 | 100.0 | Free | 0.00 | Everyone | Events | July 9, 2018 | 1.14 | 2.0 | 0.0 | 1.230449 | 2.004321
10834 | 5260 | AJ Gray Dark Icon Pack | PERSONALIZATION | 5.0 | 2 | 35.0 | 10.0 | Paid | 0.99 | Everyone | Personalization | April 29, 2018 | 1.1 | 4.1 | 9.9 | 0.477121 | 1.041393
10835 | 7164 | CD CHOICE TUBE | FAMILY | 5.0 | 10 | 5.8 | 500.0 | Free | 0.00 | Everyone | Entertainment | July 23, 2017 | 0.0.4 | 4.1 | 0.0 | 1.041393 | 2.699838
10836 | 5263 | AJ Blue Icon Pack | PERSONALIZATION | 5.0 | 4 | 31.0 | 50.0 | Paid | 0.99 | Everyone | Personalization | April 27, 2018 | 1.1 | 4.1 | 49.5 | 0.698970 | 1.707570
10837 | 6363 | Read it easy for BK | LIFESTYLE | 5.0 | 1 | 3.2 | 50.0 | Free | 0.00 | Everyone | Lifestyle | July 15, 2018 | 1.2 | 4.1 | 0.0 | 0.301030 | 1.707570
10838 | 7697 | CP Installer App | BUSINESS | 5.0 | 4 | 24.0 | 100.0 | Free | 0.00 | Everyone | Business | July 24, 2018 | 5.1.1 | 4.1 | 0.0 | 0.698970 | 2.004321
10839 | 9039 | Chronolink DX | FAMILY | 5.0 | 7 | 73.0 | 10.0 | Paid | 0.99 | Everyone | Puzzle | July 6, 2017 | 1.2 | 4.1 | 9.9 | 0.903090 | 1.041393